Search for: All records

Creators/Authors contains: "Chen, Sitan"


  1. We develop theory to understand an intriguing property of diffusion models for image generation that we term critical windows. Empirically, it has been observed that there are narrow time intervals in sampling during which particular features of the final image emerge, e.g. the image class or background color (Ho et al., 2020b; Meng et al., 2022; Choi et al., 2022; Raya & Ambrogioni, 2023; Georgiev et al., 2023; Sclocchi et al., 2024; Biroli et al., 2024). While this is advantageous for interpretability as it implies one can localize properties of the generation to a small segment of the trajectory, it seems at odds with the continuous nature of the diffusion. We propose a formal framework for studying these windows and show that for data coming from a mixture of strongly log-concave densities, these windows can be provably bounded in terms of certain measures of inter- and intra-group separation. We also instantiate these bounds for concrete examples like well-conditioned Gaussian mixtures. Finally, we use our bounds to give a rigorous interpretation of diffusion models as hierarchical samplers that progressively "decide" output features over a discrete sequence of times. We validate our bounds with synthetic experiments. Additionally, preliminary experiments on Stable Diffusion suggest critical windows may serve as a useful tool for diagnosing fairness and privacy violations in real-world diffusion models. 
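As a toy illustration of the critical-window phenomenon described in item 1 (not the paper's construction or proofs), the sketch below runs the reverse diffusion SDE for a one-dimensional mixture of two Gaussians, where the exact score is available in closed form, and records the first time at which the component ("class") of the eventual sample is effectively decided. The mixture parameters, horizon, and 0.99 posterior threshold are arbitrary demo choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Data distribution: 0.5*N(-a, s^2) + 0.5*N(+a, s^2); the sign of the component a sample
# ends up in plays the role of a discrete feature (e.g. the image class).
a, s = 3.0, 0.5
T, n_steps = 5.0, 2000
dt = T / n_steps

def moments(t):
    """Component mean offset and common variance after running the OU forward process for time t."""
    m = a * np.exp(-t)
    v = s**2 * np.exp(-2 * t) + 1.0 - np.exp(-2 * t)
    return m, v

def posterior_plus(x, t):
    """P(component = +a | x_t = x): how 'decided' the feature is at time t."""
    m, v = moments(t)
    return 0.5 * (1.0 + np.tanh(m * x / v))        # numerically stable sigmoid of the log-odds 2mx/v

def score(x, t):
    """Exact score of the mixture marginal at time t (no neural network needed in 1-D)."""
    m, v = moments(t)
    w = posterior_plus(x, t)
    return (w * (m - x) + (1 - w) * (-m - x)) / v

# Reverse-time SDE (Euler-Maruyama), started from the near-stationary N(0, 1) prior at time T.
x = rng.normal()
decided_at = None
for k in range(n_steps):
    t = T - k * dt
    x = x + (x + 2.0 * score(x, t)) * dt + np.sqrt(2.0 * dt) * rng.normal()
    w = posterior_plus(x, t - dt)
    if decided_at is None and (w > 0.99 or w < 0.01):
        decided_at = t - dt                        # first time the class posterior collapses

print("class decided around t =", None if decided_at is None else round(decided_at, 3),
      "| final sample x_0 =", round(float(x), 3))
```

Running this over many random seeds should show the decision times clustering in a narrow band of t, which is the qualitative behavior the paper's bounds formalize for mixtures of strongly log-concave components.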
  2. The authors provide the first tight sample complexity bounds for shadow tomography and classical shadows in the regime where the target error is below some sufficiently small inverse polynomial in the dimension of the Hilbert space. Specifically, they present a protocol that, given any m ∈ ℕ and ε ≤ O(d^{-1/2}), measures O(log(m)/ε²) copies of an unknown mixed state ρ ∈ ℂ^{d×d} and outputs a classical description of ρ. This description can then be used to estimate any collection of m observables to within additive accuracy ε. Previously, even for the simpler case of shadow tomography where the observables are known in advance, the best known rates either scaled benignly but suboptimally in all of m, d, ε, or scaled optimally in ε and m but included additional polynomial factors in d. Interestingly, the authors also show via dimensionality reduction that one can rescale ε and d to reduce to the regime ε ≤ O(d^{-1/2}). Their algorithm draws on representation-theoretic tools developed in the context of full state tomography.
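The protocol in item 2 targets the high-precision regime and uses representation-theoretic machinery that is out of scope for a short snippet. Purely to make the task concrete (estimate many observables from a classical description built out of few measured copies), here is a minimal sketch of the standard classical-shadows estimator with random single-qubit Pauli measurements, in the style of Huang, Kueng, and Preskill; the two-qubit state, observables, and shot count are arbitrary demo choices and nothing here reflects the paper's algorithm or its rates.

```python
import numpy as np

rng = np.random.default_rng(0)

# Single-qubit Paulis.
I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
PAULI = {"I": I2, "X": X, "Y": Y, "Z": Z}

def eigprojs(P):
    """Projectors onto the +1 / -1 eigenspaces of a single-qubit Pauli."""
    return {+1: (I2 + P) / 2, -1: (I2 - P) / 2}

# "Unknown" 2-qubit state for the demo: a noisy Bell state.
bell = np.zeros(4, dtype=complex)
bell[0] = bell[3] = 1 / np.sqrt(2)
rho = 0.9 * np.outer(bell, bell.conj()) + 0.1 * np.eye(4) / 4

def snapshot(rho):
    """Measure each qubit in a uniformly random Pauli basis; return (bases, outcomes)."""
    bases = rng.choice(["X", "Y", "Z"], size=2)
    projs = [eigprojs(PAULI[b]) for b in bases]
    outcomes = [(+1, +1), (+1, -1), (-1, +1), (-1, -1)]
    probs = np.array([np.real(np.trace(rho @ np.kron(projs[0][s0], projs[1][s1])))
                      for s0, s1 in outcomes])
    probs = np.clip(probs, 0, None)
    return bases, outcomes[rng.choice(4, p=probs / probs.sum())]

def estimate(snapshots, pauli_string):
    """Median-of-means estimate of Tr(P rho) for a 2-qubit Pauli string such as 'ZZ' or 'ZI'."""
    vals = []
    for bases, outcome in snapshots:
        v = 1.0
        for q, p in enumerate(pauli_string):
            if p != "I":                                  # identity factors contribute 1
                v *= 3.0 * outcome[q] if bases[q] == p else 0.0
        vals.append(v)
    return np.median([c.mean() for c in np.array_split(np.array(vals), 10)])

shots = [snapshot(rho) for _ in range(20000)]
for P in ["ZZ", "XX", "ZI"]:
    exact = np.real(np.trace(rho @ np.kron(PAULI[P[0]], PAULI[P[1]])))
    print(P, "shadow estimate:", round(estimate(shots, P), 3), " exact:", round(exact, 3))
```

The median-of-means step is the usual trick for turning the per-snapshot estimator into a high-probability guarantee over many observables.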
  3. We investigate the problem of predicting the output behavior of unknown quantum channels. Given query access to an n-qubit channel E and an observable O, we aim to learn the mapping ρ ↦ Tr(O E[ρ]) to within a small error for most ρ sampled from a distribution D. Previously, Huang, Chen, and Preskill proved a surprising result that even if E is arbitrary, this task can be solved in time roughly n^{O(log(1/ε))}, where ε is the target prediction error. However, their guarantee applied only to input distributions D invariant under all single-qubit Clifford gates, and their algorithm fails for important cases such as general product distributions over product states ρ. In this work, we propose a new approach that achieves accurate prediction over essentially any product distribution D, provided it is not "classical", in which case there is a trivial exponential lower bound. Our method employs a "biased Pauli analysis," analogous to classical biased Fourier analysis. Implementing this approach requires overcoming several challenges unique to the quantum setting, including the lack of a basis with appropriate orthogonality properties. The techniques we develop to address these issues may have broader applications in quantum information.
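For a system small enough to simulate, the map ρ ↦ Tr(O E[ρ]) is linear in ρ and hence in the Pauli expectations of ρ, so a toy version of the prediction task in item 3 can be solved by ordinary least squares over those features. The sketch below does this for a hypothetical two-qubit channel (a CNOT followed by local depolarizing noise) and Haar-random product-state inputs; it is only meant to make the learning problem concrete and is unrelated to the biased Pauli analysis that makes the problem tractable at large n.

```python
import numpy as np

rng = np.random.default_rng(1)

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
PAULI_2Q = [np.kron(a, b) for a in (I2, X, Y, Z) for b in (I2, X, Y, Z)]

def random_product_state():
    """A 2-qubit product state with Haar-random pure factors (a simple product distribution D)."""
    rho = np.array([[1.0]], dtype=complex)
    for _ in range(2):
        v = rng.normal(size=2) + 1j * rng.normal(size=2)
        v /= np.linalg.norm(v)
        rho = np.kron(rho, np.outer(v, v.conj()))
    return rho

# Hypothetical "unknown" channel E: a CNOT followed by depolarizing noise of rate p on each qubit.
CNOT = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]], dtype=complex)
def channel(rho, p=0.2):
    rho = CNOT @ rho @ CNOT.conj().T
    for q in range(2):
        kraus = [np.sqrt(1 - 3 * p / 4) * I2] + [np.sqrt(p / 4) * P for P in (X, Y, Z)]
        ops = [np.kron(K, I2) if q == 0 else np.kron(I2, K) for K in kraus]
        rho = sum(K @ rho @ K.conj().T for K in ops)
    return rho

O = np.kron(Z, Z)                                         # observable whose output behavior we predict

def features(rho):
    """Pauli expectations of rho; the map rho -> Tr(O E[rho]) is linear in these."""
    return np.array([np.real(np.trace(rho @ P)) for P in PAULI_2Q])

# "Query" the channel on sampled inputs, then fit the linear map by least squares.
train = [random_product_state() for _ in range(300)]
A = np.array([features(r) for r in train])
y = np.array([np.real(np.trace(O @ channel(r))) for r in train])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

test = [random_product_state() for _ in range(50)]
pred = np.array([features(r) @ coef for r in test])
truth = np.array([np.real(np.trace(O @ channel(r))) for r in test])
print("mean absolute prediction error on held-out inputs:", np.abs(pred - truth).mean())
```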
  4. Recent works have shown that diffusion models can learn essentially any distribution provided one can perform score estimation. Yet it remains poorly understood under what settings score estimation is possible, let alone when practical gradient-based algorithms for this task can provably succeed. In this work, we give the first provably efficient results along these lines for one of the most fundamental distribution families, Gaussian mixture models. We prove that gradient descent on the denoising diffusion probabilistic model (DDPM) objective can efficiently recover the ground truth parameters of the mixture model in the following two settings: 1) We show gradient descent with random initialization learns mixtures of two spherical Gaussians in d dimensions with 1/poly(d)-separated centers. 2) We show gradient descent with a warm start learns mixtures of K spherical Gaussians with Ω(log(min(K,d)))-separated centers. A key ingredient in our proofs is a new connection between score-based methods and two other approaches to distribution learning: the EM algorithm and spectral methods.
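A minimal numerical sketch of setting 1 from item 4, under simplifying assumptions: a single fixed noise scale rather than the full DDPM schedule, and a score model parameterized in the same functional form as the true score of a symmetric two-component mixture, s_θ(x) = −x + tanh(⟨θ, x⟩)·θ. Dimension, learning rate, batch size, and step count are arbitrary demo choices; at this noise scale the population minimizer is ±α·μ*, so the sign ambiguity is accounted for when reporting error.

```python
import numpy as np

rng = np.random.default_rng(0)

d = 8
mu_star = rng.normal(size=d)
mu_star *= 2.0 / np.linalg.norm(mu_star)     # ground-truth center, norm 2 (demo choice)

alpha, sigma = 0.8, 0.6                      # one fixed noise scale, alpha^2 + sigma^2 = 1

def sample_batch(n):
    """x0 ~ 0.5*N(mu*, I) + 0.5*N(-mu*, I); return the noised x = alpha*x0 + sigma*eps and eps."""
    signs = rng.choice([-1.0, 1.0], size=(n, 1))
    x0 = signs * mu_star + rng.normal(size=(n, d))
    eps = rng.normal(size=(n, d))
    return alpha * x0 + sigma * eps, eps

# Score model with the true score's functional form: s_theta(x) = -x + tanh(<theta, x>) * theta.
# DDPM / denoising objective at this noise scale: E || sigma * s_theta(x) + eps ||^2.
theta = 0.1 * rng.normal(size=d)             # random initialization, as in setting 1
lr = 0.05
for step in range(3000):
    x, eps = sample_batch(256)
    th = np.tanh(x @ theta)                                      # (n,)
    r = sigma * (-x + th[:, None] * theta) + eps                 # residual, (n, d)
    grad = 2 * sigma * (th[:, None] * r + ((1 - th**2) * (r @ theta))[:, None] * x)
    theta -= lr * grad.mean(axis=0)

# The population minimizer at this noise scale is +/- alpha * mu_star (the sign is unidentifiable).
mu_hat = theta / alpha
err = min(np.linalg.norm(mu_hat - mu_star), np.linalg.norm(mu_hat + mu_star))
print("relative parameter error:", err / np.linalg.norm(mu_star))
```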
  5. The recent proliferation of NISQ devices has made it imperative to understand their power. In this work, we define and study the complexity class NISQ, which encapsulates problems that can be efficiently solved by a classical computer with access to noisy quantum circuits. We establish super-polynomial separations in the complexity among classical computation, NISQ, and fault-tolerant quantum computation to solve some problems based on modifications of Simon's problems. We then consider the power of NISQ for three well-studied problems. For unstructured search, we prove that NISQ cannot achieve a Grover-like quadratic speedup over classical computers. For the Bernstein-Vazirani problem, we show that NISQ only needs a number of queries logarithmic in what is required for classical computers. Finally, for a quantum state learning problem, we prove that NISQ is exponentially weaker than classical computers with access to noiseless constant-depth quantum circuits.
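For context on the Bernstein-Vazirani comparison in item 5: the classical baseline needs one oracle query per bit of the hidden string. Below is a minimal sketch of that classical algorithm only; it does not model the noisy quantum circuits or the complexity-class results of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 16
secret = rng.integers(0, 2, size=n)            # hidden string s

def oracle(x):
    """Bernstein-Vazirani oracle f(x) = <s, x> mod 2."""
    return int(np.dot(secret, x) % 2)

# Classical recovery: query each standard basis vector e_i, since f(e_i) = s_i.
# This uses exactly n queries, the classical benchmark referenced in the abstract.
recovered = np.array([oracle(np.eye(n, dtype=int)[i]) for i in range(n)])
print("recovered == secret:", bool(np.all(recovered == secret)), "| queries used:", n)
```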